vpage / vregion: a minimal version with a test#10636
vpage / vregion: a minimal version with a test#10636lyakh wants to merge 3 commits intothesofproject:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Introduces a minimal “virtual pages” allocator and “virtual regions” (partitioned into interim + lifetime allocators) for SOF on Zephyr/ACE, along with a basic boot test and the required Kconfig/build/board configuration plumbing.
Changes:
- Add
vpagecontiguous virtual-page allocator andvregionallocator built on top of it. - Wire new sources and configs into Zephyr build + Kconfig (including new virtual region attribute).
- Add a Ztest-based boot test for
vregion, and bump ACE board virtual region count.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
zephyr/test/vregion.c |
Adds a basic vregion allocator test under boot tests. |
zephyr/test/CMakeLists.txt |
Builds the new test when CONFIG_SOF_VREGIONS is enabled. |
zephyr/lib/vregion.c |
Implements vregion (interim k_heap + lifetime linear allocator). |
zephyr/lib/vpages.c |
Implements vpage allocator and registers its init via SYS_INIT. |
zephyr/lib/alloc.c |
Adjusts virtual heap bundle sizing/comments to match new region sizing. |
zephyr/include/sof/lib/vregion.h |
Public API for vregion. |
zephyr/include/sof/lib/vpages.h |
Public API for vpage. |
zephyr/include/sof/lib/regions_mm.h |
Adds VIRTUAL_REGION_VPAGES_ATTR. |
zephyr/Kconfig |
Adds SOF_VREGIONS, SOF_VPAGE_MAX_ALLOCS, SOF_ZEPHYR_VIRTUAL_PAGE_REGION_SIZE, and updates heap region sizing defaults/wording. |
zephyr/CMakeLists.txt |
Builds vpages.c/vregion.c when CONFIG_SOF_VREGIONS is enabled. |
app/boards/intel_adsp_ace40_nvls.conf |
Increases virtual region count to 3 for VPAGES region. |
app/boards/intel_adsp_ace40_nvl.conf |
Increases virtual region count to 3 for VPAGES region. |
app/boards/intel_adsp_ace30_wcl.conf |
Increases virtual region count to 3 for VPAGES region. |
app/boards/intel_adsp_ace30_ptl.conf |
Increases virtual region count to 3 for VPAGES region. |
app/boards/intel_adsp_ace20_lnl.conf |
Increases virtual region count to 3 for VPAGES region. |
app/boards/intel_adsp_ace15_mtpm.conf |
Increases virtual region count to 3 for VPAGES region. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
zephyr/lib/vregion.c
Outdated
| LOG_INF("new at base %p size %#x pages %d struct embedded at %p", | ||
| (void *)vr->base, total_size, pages, (void *)vr); | ||
| LOG_DBG(" interim size %#x at %p", interim_size, (void *)vr->interim.heap.heap.init_mem); | ||
| LOG_DBG(" lifetime size %#x at %p", lifetime_size, (void *)vr->lifetime.base); |
There was a problem hiding this comment.
The LOG_INF/LOG_DBG calls here print total_size/interim_size/lifetime_size using %#x, but these are size_t. On 64-bit builds this can truncate and can fail with -Wformat. Prefer %zu or %#zx for sizes.
zephyr/lib/vregion.c
Outdated
| LOG_DBG("destroy %p size %#x pages %d", (void *)vr->base, vr->size, vr->pages); | ||
| LOG_DBG(" lifetime used %d free count %d", vr->lifetime.used, vr->lifetime.free_count); |
There was a problem hiding this comment.
vregion_destroy() logs vr->size and vr->lifetime.used with %#x/%d, but these are size_t. Please update these format strings to the correct size_t specifiers to avoid -Wformat issues.
kv2019i
left a comment
There was a problem hiding this comment.
One staging related comment inline. Otherwise looks good and addition of a small test case is great.
zephyr/Kconfig
Outdated
| hex "Size in bytes of virtual memory region for virtual heap shared for all cores" | ||
| depends on MM_DRV_INTEL_ADSP_MTL_TLB | ||
| default 0x100000 | ||
| default 0xc0000 |
There was a problem hiding this comment.
Any downsides to this change when done now? I mean this is enable the feature in many builds by default, but the vregions are not actually used yet. I wonder if these Kconfig changes should be done in the PR when vregion is actually used in pipelines?
There was a problem hiding this comment.
@lyakh And test is run at boot...? Hmm, right, so maybe ok then. With my recent user-space PR, I've kept the new tests standalone so I don't have to enable new features in Kconfig by default yet. OTOH, the major downside is these tests are not run by any CI then....
There was a problem hiding this comment.
Most of the comments in this review were carried over from the previous review. Unfortunately, the code was moved to a new PR, which required re-reviewing the entire change again. Again, the vpage_ctx.velems is a large array (SOF_VPAGE_MAX_ALLOCS = 128). It would be worth optimizing add and remove operations to avoid repeatedly scanning it linearly. Really please consider moving page mapping logic into vregion. For lifetime allocators, mapping pages progressively as needed could help reduce memory usage.
zephyr/lib/vpages.c
Outdated
| vpage_ctx.num_elems_in_use = 0; | ||
|
|
||
| /* bundles are uint32_t of bits */ | ||
| bitmap_num_bundles = SOF_DIV_ROUND_UP(block_count, 32); |
There was a problem hiding this comment.
32 => sizeof(uint32_t)? Bellow you use sizeof(uint32_t).
zephyr/lib/vpages.c
Outdated
| * This allocator manages the allocation and deallocation of virtual memory pages from | ||
| * a predefined virtual memory region which is larger than the physical memory region. | ||
| * | ||
| * Both memory regions are divided into 4kB pages that are represented as blocks in a |
There was a problem hiding this comment.
should we really hard-specify the 4kB size?
zephyr/lib/vpages.c
Outdated
| for (i = 0; i < VPAGE_MAX_ALLOCS; i++) { | ||
| if (vpage_ctx.velems[i].pages == 0) { | ||
| vpage_ctx.velems[i].pages = pages; | ||
| vpage_ctx.velems[i].vpage = | ||
| (POINTER_TO_UINT(vaddr) - | ||
| POINTER_TO_UINT(vpage_ctx.virtual_region->addr)) / | ||
| CONFIG_MM_DRV_PAGE_SIZE; | ||
| vpage_ctx.num_elems_in_use++; | ||
| break; | ||
| } | ||
| } |
There was a problem hiding this comment.
To eliminate the loop during page allocation, I suggest:
vpage_ctx.velems[page_context.num_elems_in_use].pages = pages;
vpage_ctx.velems[page_context.num_elems_in_use].vpage =
(POINTER_TO_UINT(vaddr) -
POINTER_TO_UINT(vpage_ctx.virtual_region->addr)) /
CONFIG_MM_DRV_PAGE_SIZE;
vpage_ctx.num_elems_in_use++;Especially since this array is large by default: SOF_VPAGE_MAX_ALLOCS = 128
There was a problem hiding this comment.
...or use a list. This can be a later optimisation
There was a problem hiding this comment.
Lets assume SOF_VPAGE_MAX_ALLOCS = 8.
now imagine, that somebody allocates SOF_VPAGE_MAX_ALLOCS / 2 elements
velems[0] = alloc1
velems[1] = alloc2
velems[2] = alloc3
velems[3] = alloc4
velems[4] = free
velems[5] = free
velems[6] = free
velems[7] = freethen frees all but the last
Lets free alloc1:
velems[0] = alloc4 // Move last element to position of the element being released
velems[1] = alloc2
velems[2] = alloc3
velems[3] = free
velems[4] = free
velems[5] = free
velems[6] = free
velems[7] = freethen alloc2:
velems[0] = alloc4
velems[1] = alloc3 // Move last element to position of the element being released
velems[2] = free
velems[3] = free
velems[4] = free
velems[5] = free
velems[6] = free
velems[7] = freethen alloc3:
velems[0] = alloc4 // Move last element to position of the element being released
velems[1] = free
velems[2] = free
velems[3] = free
velems[4] = free
velems[5] = free
velems[6] = free
velems[7] = freeThen allocates another SOF_VPAGE_MAX_ALLOCS / 2 elements
velems[0] = alloc4
velems[1] = alloc5
velems[2] = alloc6
velems[3] = alloc7
velems[4] = alloc8
velems[5] = free
velems[6] = free
velems[7] = freeSo we have filled velems[0] .. velems[4]. What is the problem?
You always know the position of the last element (num_elems_in_use - 1) so you replace the released element with it and clear the last one.
There was a problem hiding this comment.
Alternatively, we could completely remove this table and rely on two bitarrays. One bitarray would mark that a given page is the first allocated page. The second one would encode allocation size - either by setting a bit on the last page of the allocated block or, inversely, by marking that the next page still belongs to the same allocation. This approach is currently used in vmh. Allocation size calculation there requires heavy optimization, but it is a good starting point to present the idea. With proper optimization, we could check continuity of 32 pages at once, and an ffs operation could return the index of the last page.
Lines 459 to 496 in 1f18231
There was a problem hiding this comment.
So we have filled
velems[0] .. velems[4]. What is the problem?
@softwarecki I understand your proposal, I didn't say, that there was a problem, I just suggested to make that a later optimisation.
There was a problem hiding this comment.
Now SOF_VPAGE_MAX_ALLOCS / 2 - 1 elements are free, but you'll fail any attempt to allocate memory again - as long as the last element is kept.
I didn't say, that there was a problem
👀 Okeyyyy...
Using a list doesn't bring a performance benefit compared to the current array-based approach but could consume more memory due to additional link pointers in each list element.
Optimizations for adding and removing elements from the array were already proposed in a previous review. They are straightforward to implement and can be applied immediately. I even provided example code in the comments. An optimization based on a bitarray does require more work and could reasonably be planned as a next step.
zephyr/lib/vpages.c
Outdated
| if (i == VPAGE_MAX_ALLOCS) { | ||
| /* | ||
| * The caller is holding a lock, so someone must have changed | ||
| * vpages state without taking it | ||
| */ | ||
| LOG_ERR("data corruption, check for races"); | ||
| return -EPROTO; | ||
| } |
There was a problem hiding this comment.
It looks like AI slop. We already check for allocation elements, so we don't expect to find no free entry in this array. Allocations are protected by mutex so there is no "someone".
| /* check for valid ptr which must be page aligned */ | ||
| if (!IS_ALIGNED(ptr, CONFIG_MM_DRV_PAGE_SIZE)) { | ||
| LOG_ERR("error: invalid non aligned page pointer %p", ptr); | ||
| return -EINVAL; |
There was a problem hiding this comment.
k_panic()? This seems like a serious problem and we should not allow it to leak memory.
There was a problem hiding this comment.
not here certainly. Anybody can call vpage_free() with a wrong pointer
There was a problem hiding this comment.
Passing a wrong pointer sounds like a serious problem...
There was a problem hiding this comment.
Passing a wrong pointer sounds like a serious problem...
in the calling code, yes. But not here. I think a panic is more for internal inconsistencies. Passing an invalid argument to a function doesn't sound like one to me. But I suppose we can find different cases when it's used in SOF. So, it isn't something I have a strong opinion about, but unless there are good arguments, I'd keep it this way.
zephyr/include/sof/lib/vregion.h
Outdated
| * Allocate aligned memory from the specified virtual region based on the memory type. | ||
| * | ||
| * @param[in] vr Pointer to the virtual region instance. | ||
| * @param[in] type Type of memory to allocate (static, dynamic, or shared static). |
zephyr/lib/vregion.c
Outdated
| }; | ||
|
|
||
| /* zephyr k_heap for interim allocations. TODO: make lockless for improved performance */ | ||
| struct zephyr_heap { |
There was a problem hiding this comment.
zephyr_heap? A vague and misleading name. It suggests a Zephyr origin.
zephyr/lib/vregion.c
Outdated
| * information about the base address, size, and allocation status | ||
| * of the region. | ||
| * | ||
| * Currently the virtual region memory region can be partitioned into two areas |
There was a problem hiding this comment.
one "region" too many?
zephyr/lib/vregion.c
Outdated
| if (align == CONFIG_DCACHE_LINE_SIZE) | ||
| size = ALIGN_UP(size, CONFIG_DCACHE_LINE_SIZE); | ||
|
|
||
| /* calculate new heap object size for object and alignments */ |
There was a problem hiding this comment.
maybe the comment could be clarified?
| return; | ||
| } | ||
|
|
||
| LOG_ERR("error: vregion free invalid pointer %p", ptr); |
There was a problem hiding this comment.
that would be a DOS invitation
There was a problem hiding this comment.
To execute DOS, all is need for a userspace thread is to reference an invalid address or divide by 0, and the firmware crashes. There's no exception recovery in Zephyr. Such panic will allow to immediately detect a bug instead of slowly losing available memory.
There was a problem hiding this comment.
Actually @softwarecki that's not correct. Zephyr can recover, but we have just not implemented this yet in SOF. See https://docs.zephyrproject.org/latest/kernel/services/other/fatal.html#fatal-error-handling
I do agree the primary goal is to protect memory and data. Ability to recover from misbehaving user-space code is an additional step, but Zephyr does provide support to implement this as well.
There was a problem hiding this comment.
Ok, so lets use assert to detect it in debug builds.
There was a problem hiding this comment.
Ok, so lets use assert to detect it in debug builds.
ok, I checked sys_heap_free() in Zephyr and it does use __ASSERT() when checking (some) invalid addresses, so we could do something similar. But what should it do if assertions are disabled? Just proceed and crash somewhere randomly later? Or use both an assertion and a return? I think we can use CHECKIF() and enable CONFIG_ASSERT_ON_ERRORS in our debug overlay
There was a problem hiding this comment.
But what should it do if assertions are disabled? Just proceed and crash somewhere randomly later?
At the moment in the code you proposed there is only LOG_ERR, so that's exactly what's happening.
Or use both an assertion and a return?
Yes, we can use __ASSERT() and LOG_ERR(). What return?
Add a simple virtual page allocator that uses the Zephyr memory mapping infrastructure to allocate pages from a virtual region and map them to physical pages. Due to simplicity, the number of allocations is limited to CONFIG_SOF_VPAGE_MAX_ALLOCS Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Add support for per pipeline and per module virtual memory regions. The intention is to provide a single virtual memory region per pipeline or per DP module that can simplify module/pipeline memory management. The region takes advantage of the way pipeline and modules are constructed, destroyed and used during their lifetimes. 1) memory tracking - 1 pointer/size per pipeline or DP module. 2) memory read/write/execute permissions 3) cache utilization. Modules and pipelines will allocate from their region only and this will be abstracted via the allocation APIs. Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Add boot-time tests for basic vpage and vregion functionality. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
This is a slightly simplified version of #10281 with an added test